HDFS-17293. First packet data + checksum size will be set to 516 bytes when writing to a new block.#6368
Conversation
…s when writing to a new block.
|
@Hexiaoqiao @tomscut @ayushtkn @zhangshuyan0 Sir, could you please help me review this code when you have free time?Thanks ahead. |
|
💔 -1 overall
This message was automatically generated. |
...op-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSOutputStream.java
Show resolved
Hide resolved
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
@Hexiaoqiao @zhangshuyan0 @slfan1989 Sir, have updated this PR, added an unit test. could you please help me review this pr when you are free? Thanks~ |
zhangshuyan0
left a comment
There was a problem hiding this comment.
Great catch! Leave some small suggestions.
| fos.write(buf); | ||
| fos.hflush(); | ||
| loop++; | ||
| Assert.assertNotEquals(crc32c.getBytesPerChecksum() + crc32c.getChecksumSize(), |
There was a problem hiding this comment.
It is more appropriate to precisely specify the expected packetSize here.
Outside the while loop:
int chunkSize = crc32c.getBytesPerChecksum() + crc32c.getChecksumSize();
int packetContentSize = (dfsConf.getInt(DFS_CLIENT_WRITE_PACKET_SIZE_KEY, DFS_CLIENT_WRITE_PACKET_SIZE_DEFAULT) - PacketHeader.PKT_MAX_HEADER_LEN)/chunkSize*chunkSize;
And here:
Assert.assertEquals(((DFSOutputStream) fos.getWrappedStream()).packetSize, packetContentSize);
There was a problem hiding this comment.
Sir, thanks for this valuable suggestion. Will fix it soon.
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSOutputStream.java
Outdated
Show resolved
Hide resolved
|
This PR has corrected the size of the first packet in a new block, which is great. However, due to the original logical problem in Line540, when we pass blockSize - getStreamer().getBytesCurBlock() to computePacketChunkSize as the first parameter, computePacketChunkSize is likely to cause the data that could have been sent in one data packet to be split into two data packets and sent.
|
0a43db0 to
af8a23b
Compare
|
@zhangshuyan0 Sir, Have modified according to your review opinions. Please take a look when you have free time. Thanks a lot~ |
|
💔 -1 overall
This message was automatically generated. |
af8a23b to
da1ea49
Compare
da1ea49 to
68292c7
Compare
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
Sir, very nice catch. I think below code may resolve the problem you found. Please take a look when you are free. if (!getStreamer().getAppendChunk()) {
int psize = 0;
if (blockSize == getStreamer().getBytesCurBlock()) {
psize = writePacketSize;
} else if (blockSize - getStreamer().getBytesCurBlock() + PacketHeader.PKT_MAX_HEADER_LEN
< writePacketSize ) {
psize = (int)(blockSize - getStreamer().getBytesCurBlock()) + PacketHeader.PKT_MAX_HEADER_LEN;
} else {
psize = (int) Math
.min(blockSize - getStreamer().getBytesCurBlock(), writePacketSize);
}
computePacketChunkSize(psize, bytesPerChecksum);
} |
68292c7 to
96da2c8
Compare
Thank you very much for investing your time in fixing these bugs. The above fixes did not take |
@zhangshuyan0 Sir, Agree with you, let's discuss this issue in the new PR. The failed tests were all passed in my local. |
|
Committed to trunk. Thanks for your contribution @hfutatzhanghb . |
@zhangshuyan0 Sir, Thanks a lot for your reviewing. |
|
🎊 +1 overall
This message was automatically generated. |
…s when writing to a new block. (apache#6368). Contributed by farmmamba. Reviewed-by: He Xiaoqiao <hexiaoqiao@apache.org> Signed-off-by: Shuyan Zhang <zhangshuyan@apache.org>
Description of PR
First packet size(data length + checksum length) will be set to 516 bytes when writing to a new block. It is an unnecessary and wrong behavior.
In method computePacketChunkSize, the parameters psize and csize would be (0, 512)when writting to a new block.
It should better use writePacketSize to improve the speed of sending data.